
chore: Dockerfile remove redundant directives #3914

Open
wants to merge 8 commits into master

Conversation

polarathene

@polarathene polarathene commented Dec 28, 2024

  • Removes the /etc/nsswitch.conf workaround.
  • Removes the redundant usage of VOLUME directives.

Related issue(s)

The associated issues have been ignored for over a year. They've now been marked as stale; this PR attempts to address them for this repo.

See the issues for detailed justification of the summarized changes.

Checklist

  • I have read the contributing guidelines.
  • I have referenced an issue containing the design document if my change introduces a new feature.
  • I am following the contributing code guidelines.
  • I have read the security policy.
  • I confirm that this pull request does not address a security
    vulnerability. If this pull request addresses a security vulnerability, I
    confirm that I got the approval (please contact
    security@ory.sh) from the maintainers to push
    the changes.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added or changed the documentation.

Further Comments

These changes should be rather straightforward. The bulk of the changeset is just noise repeating the same lines across several Dockerfiles, but I did notice inconsistencies which hint that these files may need to be revisited by someone more familiar with them.

Especially with the HSM image, which appears to have had its runner stage broken since this June 2023 PR, though only earlier stages are built via the Makefile:

hydra/Makefile

Lines 89 to 96 in 8e71f91

.PHONY: quicktest-hsm
quicktest-hsm:
	DOCKER_BUILDKIT=1 DOCKER_CONTENT_TRUST=1 docker build --progress=plain -f .docker/Dockerfile-hsm --target test-hsm -t oryd/hydra:${IMAGE_TAG} --target test-hsm .

.PHONY: test-refresh
test-refresh:
	UPDATE_SNAPSHOTS=true go test -failfast -short -tags sqlite,sqlite_omit_load_extension ./...
	DOCKER_BUILDKIT=1 DOCKER_CONTENT_TRUST=1 docker build --progress=plain -f .docker/Dockerfile-hsm --target test-refresh-hsm -t oryd/hydra:${IMAGE_TAG} --target test-refresh-hsm .
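As an aside, the quicktest-hsm recipe quoted above passes `--target test-hsm` twice (and test-refresh likewise duplicates its `--target`). Assuming that duplication really is in the repo's Makefile and not an artifact of the quote, a cleaned-up sketch of the first recipe would be:

```makefile
.PHONY: quicktest-hsm
quicktest-hsm:
	DOCKER_BUILDKIT=1 DOCKER_CONTENT_TRUST=1 docker build --progress=plain \
		-f .docker/Dockerfile-hsm --target test-hsm -t oryd/hydra:${IMAGE_TAG} .
```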

The HSM quickstart compose example should attempt to build the runner stage and fail:

dockerfile: .docker/Dockerfile-hsm

@CLAassistant

CLAassistant commented Dec 28, 2024

CLA assistant check
All committers have signed the CLA.

@polarathene
Author

Given the past associated issues were neglected, I am not too keen to invest more time than necessary.

I'll give the CLA a look and the other checklist items if this PR actually gets acknowledged with interest to merge it.


The scratch image seems redundant, you've got a replacement image now with the rough equivalent via Google distroless base image. You might as well drop it? The main difference apart from noted issues in the scratch Dockerfile is also a lack of the sqlite support.

Your alpine vs sqlite (alpine) images are effectively the same too, except for the sqlite package and slightly different default CMD. Probably no need or benefit maintaining the two with such a low distinction?

Member

@aeneasr aeneasr left a comment


Given the past associated issues were neglected, I am not too keen to invest more time than necessary.

No problem!

The scratch image seems redundant, you've got a replacement image now with the rough equivalent via Google distroless base image. You might as well drop it? The main difference apart from noted issues in the scratch Dockerfile is also a lack of the sqlite support.

Some dockerfiles are only used in the repo itself (for dev purposes) and some are being pushed to our docker registry. You can find the files used for prod distribution in the goreleaser config. Generally not too keen to deprecate image variants because someone always complains about it.

Your alpine vs sqlite (alpine) images are effectively the same too, except for the sqlite package and slightly different default CMD. Probably no need or benefit maintaining the two with such a low distinction?

That makes sense but please in another pr

.docker/Dockerfile-alpine
Comment on lines +56 to +59
# NOTE: This is broken already. Even though this image provides a shell, you'd need to configure it with
# `SHELL ["/busybox/sh", "-c"]`, however `apt-get` does not exist either in a distroless image.
# This was originally an Alpine image; the refactoring was not verified properly in this commit:
# https://github.com/ory/hydra/commit/c1e1a569621d88365dceee7372ca49ecd119f939#diff-ae54bef08e3587b28ad8e93eb253a9a5cd9ea6f4251977e35b88dc6b42329e25L31
Member


The HSM image is really just to run some e2e hsm tests. It's not being distributed and should not be used.

Member

@aeneasr aeneasr Jan 2, 2025


Yeah, e2e tests are now failing. Probably just revert the changes here

https://github.com/ory/hydra/actions/runs/12523819546/job/34933781514?pr=3914

@aeneasr aeneasr requested a review from Demonsthere January 2, 2025 10:19
@polarathene
Author

polarathene commented Jan 5, 2025

Generally not too keen to deprecate image variants because someone always complains about it.

Since they're the same though, just use the same Dockerfile and publish it as two separate images? That should avoid your concern about dropping the original source file for an image variant?

Your alpine vs sqlite (alpine) images are effectively the same too, except for the sqlite package and slightly different default CMD. Probably no need or benefit maintaining the two with such a low distinction?

That makes sense but please in another pr

👍


The HSM image is really just to run some e2e hsm tests. It's not being distributed and should not be used.

No worries, just pointing out that it's presently broken since the switch; AFAIK your tests don't build this final stage of the image, so it's not getting caught by them.

Yeah, e2e tests are now failing. Probably just revert the changes here

https://github.com/ory/hydra/actions/runs/12523819546/job/34933781514?pr=3914

That is an easy fix, it just wants apt-get instead of apt? Or has something changed so that it can't find the package?


EDIT: Actually I don't see where I made a change to affect that? 🤔 It's complaining about this part of your CI:

- name: Setup HSM libs and packages
  run: |
    sudo apt install -y softhsm opensc

test-hsm:
  name: Run HSM tests
  needs:
    - sdk-generate
  runs-on: ubuntu-latest

I'm surprised the image has been building for you? The softhsm package hasn't been supported since Ubuntu 16.04 (Xenial); it has since been packaged as part of softhsm2.

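A minimal fix sketch for that CI step on current Ubuntu runners, swapping in the softhsm2 package (the apt-get update line is a common precaution I've added, not something from the original step):

```yaml
- name: Setup HSM libs and packages
  run: |
    sudo apt-get update
    sudo apt-get install -y softhsm2 opensc
```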


UPDATE 2: I've split this concern out to a separate issue, it's specific to the CI runner (Ubuntu) vs the images (Debian, where it's a non-issue).

@polarathene
Author

polarathene commented Jan 5, 2025

TL;DR: Spotted a few more bugs/issues while looking into responding to your feedback.

EDIT: Extracted to separate issues, feel free to ignore the rest of this comment:

Registries and tags (publishing bug)

Some dockerfiles are only used in the repo itself (for dev purposes) and some are being pushed to our docker registry.

Is the DockerHub registry oryd/hydra no longer being used? (EDIT: Yes it is, just no new hydra release for approx a year, redundant images are not actually published as implied)

  • It has some tags for images as SHA256 digests with a .sig extension that don't seem to be actual images; those shouldn't be there.
  • The last release was about a year ago with latest and v2.2.0 tags published. That seems to be the latest release on GH releases page too, so it seems like DockerHub is still a valid registry being maintained.
  • While I do see these images published as multi-platform images, I'm also seeing individual images published with their platform as part of the tag (eg: v2.2.0-arm64 + v2.2.0-arm64-distroless) which seems redundant?

Perhaps something is a bit iffy with your release process?

You can find the files used for prod distribution in the goreleaser config.

I'm not too familiar with GoReleaser, but I am with the GitHub Actions equivalent for building and publishing images to DockerHub.

I only see two variables defined here:

hydra/.goreleaser.yml

Lines 11 to 12 in 8e5fae4

dockerfile_alpine: ".docker/Dockerfile-alpine"
dockerfile_static: ".docker/Dockerfile-distroless-static"

Which seems to align with your two variants on DockerHub: the -distroless tag suffix is the Debian image with glibc and no package manager, while the version-only tags are Alpine with musl and its package manager.

I haven't done a thorough search for where the other Dockerfile variants are potentially being used (which may be internal for you, or referenced externally from this repo, like in docs?), but unless there's another registry you had in mind, your earlier concern about deprecating variants doesn't seem applicable if they were never published to registries?

Your install docs for Hydra with Docker reference oryd/hydra but not a specific image registry, so that often defaults to DockerHub.


No mention there of the individual platform variants or the .sig digest tags. I'm going to assume that's a publishing bug with your GoReleaser config?

oryd/hydra default Alpine image vs -sqlite variant

Both this repo's Get Started and Develop sections refer to a different Hydra docs page for Docker which uses the quickstart.yml, and that likewise refers to oryd/hydra:

image: oryd/hydra:v2.2.0

Although the Develop section instructs make docker to build an image of the same tag locally instead of pulling from the registry, and that uses the -sqlite image variant Dockerfile:

hydra/Makefile

Lines 71 to 74 in 8e5fae4

# Build local docker images
.PHONY: docker
docker:
	DOCKER_BUILDKIT=1 DOCKER_CONTENT_TRUST=1 docker build --progress=plain -f .docker/Dockerfile-build -t oryd/hydra:${IMAGE_TAG}-sqlite .

As discussed earlier, the -sqlite variant is effectively the Alpine image with the sqlite package (2.5 MB added weight uncompressed) and with CMD set to serve, which will just output a help message as it expects serve all / serve public / serve admin. Meanwhile the published images default to serve all, and this -sqlite image, if built locally as oryd/hydra, has the container command modified in quickstart.yml anyway... so there's really no compelling reason to keep this Dockerfile around either.

Technically though, the quickstart.yml has a specific release tag (good), while the make docker command defaults IMAGE_TAG to latest if not specified. Since the quickstart doesn't use that env var, and the README's Develop section doesn't mention it either, the image is built locally but the compose config still pulls from DockerHub 😅 So that's another "bug" observed.
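One workaround sketch for that mismatch, assuming make docker was run with its default IMAGE_TAG=latest and the quickstart service is named hydra: a compose override file (hypothetical name quickstart.override.yml) pointing the service at the locally built tag:

```yaml
# Merge with: docker compose -f quickstart.yml -f quickstart.override.yml up
services:
  hydra:
    # Tag produced by `make docker` when IMAGE_TAG defaults to `latest`:
    image: oryd/hydra:latest-sqlite
    # Never pull from DockerHub; fail fast if the local build is missing:
    pull_policy: never
```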


Reference - Publishing image via GHA instead

For reference this is what publishing to image registries looks like with Github Actions:

# Tag an image with semver pattern (useful for tools that understand the tag convention),
# optionally with a tag rule that reflects your primary git branch builds for testing:
- name: 'Prepare tags'
  id: prep
  uses: docker/metadata-action@v5
  with:
    images: |
      ${{ secrets.DOCKER_REPOSITORY }}
      ${{ secrets.GHCR_REPOSITORY }}
    tags: |
      type=edge,branch=master
      type=semver,pattern={{major}}
      type=semver,pattern={{major}}.{{minor}}
      type=semver,pattern={{major}}.{{minor}}.{{patch}}


# Two registries to publish to configured (DockerHub and GHCR):
- name: 'Login to DockerHub'
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKER_USERNAME }}
    password: ${{ secrets.DOCKER_PASSWORD }}
- name: 'Login to GitHub Container Registry'
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}


# Build the image from a Dockerfile for the requested platforms,
# then publish a multi-platform image at the registries with the tags declared above
- name: 'Build and publish images'
  uses: docker/build-push-action@v6
  with:
    context: .
    # Customize any Dockerfile ARG values here:
    build-args: |
      DMS_RELEASE=${{ github.ref_type == 'tag' && github.ref_name || 'edge' }}
      VCS_REVISION=${{ github.sha }}
    platforms: linux/amd64,linux/arm64
    push: true
    tags: ${{ steps.prep.outputs.tags }}
    # Disable provenance attestation: https://docs.docker.com/build/attestations/slsa-provenance/
    provenance: false

That's all there is to it really, works well! (I omitted a little bit from the link referenced for brevity)

Github Actions also has reusable workflows, allowing you to set up a template with the above steps and pass input variables in to support the different Dockerfile and tag variants, for example. You can see an example of that here.
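A rough sketch of what such a reusable-workflow setup could look like for the two published variants (the build-image.yml workflow path and its inputs are hypothetical; the Dockerfile paths and -distroless suffix are taken from the goreleaser config above):

```yaml
# .github/workflows/release.yml (sketch): one job per image variant,
# each calling a shared build-and-publish workflow with different inputs
jobs:
  alpine:
    uses: ./.github/workflows/build-image.yml
    with:
      dockerfile: .docker/Dockerfile-alpine
      tag-suffix: ""
    secrets: inherit
  distroless:
    uses: ./.github/workflows/build-image.yml
    with:
      dockerfile: .docker/Dockerfile-distroless-static
      tag-suffix: "-distroless"
    secrets: inherit
```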

Perhaps GoReleaser is simpler than that, but the reference I've provided with the official Docker GitHub Actions doesn't have that same publishing weirdness at DockerHub for the mailserver/docker-mailserver tags pushed.

@polarathene polarathene requested a review from a team as a code owner January 5, 2025 22:25
@polarathene
Author

Documenting some more feedback should it appeal to any maintainers that want to pursue it.

A Dockerfile with flexible sourcing of the Hydra binary

I only mention this since you're currently maintaining both approaches via separate Dockerfiles.

Dockerfile-distroless-static is effectively the last stage of Dockerfile-build, but without the /var/lib/sqlite directory.

If you want a Dockerfile that can build an image with the hydra binary COPY'd from either the builder stage or the build context (supplied externally), you could have both in the same Dockerfile as different stages that you select via a build --target option, instead of specifying different files.

FROM runner AS provide-hydra
COPY hydra /usr/bin/hydra

FROM runner AS build-hydra
COPY --from=builder /usr/bin/hydra /usr/bin/hydra

As the last stage will be the default target, it would build Hydra during the image build by default. To use an external build of Hydra instead, just use:

docker build --target provide-hydra --file Dockerfile-build --tag oryd/hydra:latest-distroless .

A Dockerfile that builds with configurable Alpine/Distroless base image

The FROM runner stage reference could be an ARG to switch between the Alpine and Debian supported image bases, which would also remove the need for the separate Dockerfile-alpine.

Roughly, after the builder stage in Dockerfile-build, you could have the following:

# NOTE: ARG referenced via FROM must be defined at the top of the Dockerfile
# Place this above the builder stage
ARG BASE_IMAGE=alpine


# These stages setup the Alpine and Debian base images for the runtime stage:
FROM alpine:3.20 AS alpine
RUN apk add --no-cache ca-certificates
FROM gcr.io/distroless/static-debian12:nonroot AS distroless


# The runner stage defaults to Alpine; to build the distroless variant instead use:
# `docker build --build-arg BASE_IMAGE=distroless`
FROM ${BASE_IMAGE} AS runner
RUN --mount=from=busybox:latest,src=/bin/,dst=/bin/ mkdir -p /var/lib/sqlite
ENTRYPOINT ["hydra"]
CMD ["serve", "all"]


# Final stage decides where to retrieve the hydra binary from.
# Defaults to `build-hydra`, provide your own build via
# `docker build --target provide-hydra`
FROM runner AS provide-hydra
COPY hydra /usr/bin/hydra
FROM runner AS build-hydra
COPY --from=builder /usr/bin/hydra /usr/bin/hydra

All 4 image variants can then be built:

docker build --build-arg BASE_IMAGE=alpine --target build-hydra
docker build --build-arg BASE_IMAGE=alpine --target provide-hydra
docker build --build-arg BASE_IMAGE=distroless --target build-hydra
docker build --build-arg BASE_IMAGE=distroless --target provide-hydra

Personally though, if you're doing static builds of Hydra outside of the Dockerfile to copy into the image only for some runtime purpose like development, you could just use a volume mount instead?

If the builder stage were always used for copying the hydra binary, then the prior/existing /var/lib/sqlite dir creation approach is valid (instead of the busybox approach to support mkdir with distroless):

COPY --from=builder /var/lib/sqlite /var/lib/sqlite

You can also use a FROM scratch stage if you'd want the builder to output the binary to the host instead of an image via --output:

FROM scratch AS export-hydra
COPY --from=builder /usr/bin/hydra /hydra
# The image contents are exported to the local filesystem:
docker buildx build --output ./ --target export-hydra --file Dockerfile-build .

That'd give you the inverse of copying an external build into the image if you found it more convenient to build with Docker and export the binary for releases elsewhere.

Dockerfile-hsm improvements

Likewise for Dockerfile-hsm, the -tags for Go need to be adjusted/appended, but otherwise the differences are minimal:

# Use the same `builder` stage from `Dockerfile-build`
# NOTE: this might require `go test` to also include the arg `-ldflags="-extldflags=-static"`?
# Modify the `builder` stage to likewise use the `GO_BUILD_TAGS` ARG too
# Define the ARG above the builder stage to assign a common default
ARG GO_BUILD_TAGS=sqlite,sqlite_omit_load_extension


FROM builder AS test-hsm-base
ENV HSM_ENABLED=true
ENV HSM_LIBRARY=/usr/lib/softhsm/libsofthsm2.so
ENV HSM_TOKEN_LABEL=hydra
ENV HSM_PIN=1234

RUN apt-get update && apt-get install -y softhsm2 opensc
RUN pkcs11-tool --module "$HSM_LIBRARY" --slot 0 --init-token --so-pin 0000 --init-pin --pin "$HSM_PIN" --label "$HSM_TOKEN_LABEL"

FROM test-hsm-base AS test-hsm
ARG GO_BUILD_TAGS
RUN go test -p 1 -failfast -short -tags=${GO_BUILD_TAGS},hsm ./...

FROM test-hsm-base AS test-hsm-refresh
ARG GO_BUILD_TAGS UPDATE_SNAPSHOTS=true
RUN go test -p 1 -failfast -short -tags=${GO_BUILD_TAGS},hsm,refresh ./...

Much smaller/simpler to maintain like that.

The GO_BUILD_TAGS ARG could be avoided, but it might help to centralize on it, plus you can easily adjust them at build time for all image variants this way.

# Now you can build via target:
docker build --target test-hsm
docker build --target test-hsm-refresh

# And for the equivalent of the final stage of `Dockerfile-hsm` (but one that actually works),
# just set the build arg to that HSM stage:
docker build --build-arg BASE_IMAGE=test-hsm-base

While this could allow for a single Dockerfile for every image this repo builds, I don't really encourage that. Since the HSM images are only for internal use, you could just perform an extra build command to build the builder stage as a separate image before Dockerfile-hsm references it via FROM (or you could leverage a Bake config file to keep it as a single build command).

Regardless, Dockerfile-hsm could be simplified with the above approach. You could also throw away the final stage since it doesn't actually build and apparently isn't used anywhere anyway?

Author

@polarathene polarathene left a comment


Apologies about all the verbosity, I don't really want to think about all this again and move onto other things instead 😂

Here's some final feedback to consider. I understand all my input could be overwhelming so I'll wrap this up with a checklist summary and links in a follow-up comment that should help make this more digestible for anyone interested.

# NOTE: This is required for read/write by SQLite.
install --owner ory --group ory --directory /var/lib/sqlite

# NOTE: Presumably this was already created by the prior RUN directive
Author


Suggested change
# NOTE: Presumably this was already created by the prior RUN directive
# NOTE: This location was created by `softhsm2` package install with ownership `root:softhsm`,
# while the `pkcs11-tool` command generated a token dir + files with permissions of 600.
# To support running hydra in the container as the non-root ory user, grant ownership for RW:

/var/lib/softhsm/tokens is created when installing the softhsm / softhsm2 package. Ownership is root:softhsm, as is the token dir/files generated by the earlier pkcs11-tool command.

Rather than chown -R this directory and contents, it'd be wiser to just add ory to the softhsm group. However, the token generated only allows access to the file owner (root):

$ ls -l /var/lib/softhsm
drwxrws--- 2 root softhsm 4096 Apr  1  2024 tokens

$ ls -l /var/lib/softhsm/tokens
drwx--S--- 2 root softhsm 4096 Jan  5 22:34 c0e04acd-3e15-0266-6003-f9394edce34b

$ ls -l /var/lib/softhsm/tokens/c0e04acd-3e15-0266-6003-f9394edce34b
-rw------- 1 root softhsm   8 Jan  5 22:34 generation
-rw------- 1 root softhsm   0 Jan  5 22:34 token.lock
-rw------- 1 root softhsm 320 Jan  5 22:34 token.object

So in this case chown is required for the owner specifically. Probably shouldn't mess with the group ownership? 🤷‍♂️
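To illustrate why group membership alone isn't enough here, a self-contained demo with a scratch directory (not the actual softhsm paths):

```shell
# Recreate the relevant permissions: token files are created mode 600 (rw-------),
# so only the file owner has access; group members (e.g. ory in softhsm) get nothing.
mkdir -p /tmp/tokens-demo
touch /tmp/tokens-demo/token.object
chmod 600 /tmp/tokens-demo/token.object
stat -c '%a' /tmp/tokens-demo/token.object   # prints: 600
```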


Personally unless there's a clear reason for building an image to run with a non-root user, I'd suggest keeping it simple to maintain and just using root which avoids all this.

Anyone bind mounting a volume for local storage to the container is not going to have the ory UID/GID match their host user since regardless of the previous implicit UID/GID (100:101) or the static 500:500 proposed here, the typical non-root host user is 1000:1000.

That leaves the remaining benefit as a "best practice", but AFAIK it's mostly moot if the user instead runs the container with Podman/Docker in rootless mode, which uses /etc/subuid + /etc/subgid tables on the host to map the container's root UID + GID to the host's user and any other container UID/GID values to the subuid / subgid range. Alternatively, when run as rootful, the container can be run with all capabilities dropped, which is implicit when running a process as a non-root user in the container.
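For context, that rootless mapping is driven by subordinate ID ranges on the host; an example /etc/subuid entry (username and range are illustrative):

```
# /etc/subuid — the host user `alice` may map 65536 subordinate UIDs starting at 100000.
# In rootless mode, container UID 0 (root) maps to alice's own UID,
# container UID 1 maps to 100000, UID 2 to 100001, and so on.
alice:100000:65536
```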

Member


Personally unless there's a clear reason for building an image to run with a non-root user, I'd suggest keeping it simple to maintain and just using root which avoids all this.

That leaves the remaining benefit as a "best practice", but AFAIK it's mostly moot if the user instead runs the container with Podman/Docker in rootless mode, which uses /etc/subuid + /etc/subgid tables on the host to map the container's root UID + GID to the host's user and any other container UID/GID values to the subuid / subgid range. Alternatively, when run as rootful, the container can be run with all capabilities dropped, which is implicit when running a process as a non-root user in the container.

[...] mostly moot if the user instead [...]

security is never moot if you assume that everyone is smart or experienced enough to do the right thing. Running as root will fail several CI checks, please revert the user changes made

Author


Uhh, I generally don't see people understanding non-root practice in containers, just pushing for it as a security practice regardless of knowing why.

As I've said, you'll have default capabilities granted to the container root user, but these aren't the same as host root (full caps). Switching to a non-root user just drops these additional caps, which you can do yourself explicitly as a security practice if you really need to (dropping all caps, or, when compatible, switching the user to non-root if you really insist).

The other reason to use non-root is that if container escape is accomplished, escaping as the root user to become root on the host would allow more damage; but non-root in a container is not exempt from this either, as you can still escape as non-root and become root on the host depending on the vulnerability exploited. The bulk of vulnerabilities require granting capabilities or intentionally relaxing security for a container beyond what would otherwise be permitted by default, hence why I'm not really sold on the non-root container user "best practice".

I've seen a variety of containers that adopted non-root as a security practice, but then did workarounds that made security worse. Some granted the non-root user access to the docker socket, others granted capabilities that wouldn't be allowed by default, or mandated capabilities to support non-root in such a way that even someone who isn't using the feature requiring that capability, and knows how to properly lock down a container, is forced to grant the capability just for the binary to run...

To assume that building a container that runs as non-root is always more secure would be a mistake. Inexperienced users can make a container insecure regardless; trying to address incompetence with a blanket "fix" that often introduces more caveats, especially when the image fails to document this practice, is ill-advised IMO. If you're going to defend the practice, have a good reason beyond parroting / hand-waving the justification.

More security-conscious users (even inexperienced ones) should be using rootless Docker/Podman, where it's perfectly OK to have the container user as root, since escaping the container does not result in retaining the same UID/GID of the container user, thanks to user namespace mapping.

Root in a container is not equivalent to root on the host. I've explained and demonstrated this so many times in the past to those that don't know better but parrot paranoia insisting that they know what they're talking about.


Running as root will fail several CI checks, please revert the user changes made

I'm not sure what CI checks you were referring to, but your response is to a comment that proposed dropping the non-root approach. The PR respected the non-root decision; all it did here was ensure consistency with the other images for the non-root user setup, thus there is nothing to revert.

Comment on lines +7 to +15
# Add a user/group for Ory with a stable UID + GID:
# NOTE: This only appears relevant for supporting hydra as non-root, otherwise unnecessary.
addgroup --system --gid 500 ory
adduser --system --uid 500 \
--gecos "Ory User" \
--home /home/ory \
--ingroup ory \
--shell /sbin/nologin \
ory
Author


Proposal to consider removing this in future follow-up PR.

  • Only seems relevant as a common "best practice" that some users request. It has its fair share of caveats though, especially when not clearly communicated/documented for the image.
  • The only known dependency on this is optional SQLite support with a fixed DSN location at /var/lib/sqlite, but this becomes redundant if the container has a bind mount volume, which again is not documented for clarity, despite being a common practice for persistence vs named data volumes.
  • Without any other support needed in the container, a user that wants this could instead run the container with the user option to choose the user to switch to, gaining the same benefits.
  • If Hydra ever needs to use some Linux capabilities, you'd need to add a step with setcap to grant them for non-root users, and ideally handle checking this at runtime in Go to raise the capability rather than enforce it via setcap (especially if the feature using it is optional). With root (including rootless Docker/Podman), the need for setcap is avoided. Keep this simple to maintain; historically in this repo the Docker support is already showing inconsistencies in maintenance, mostly due to redundant noise for running as a non-root user by default.

Comment on lines +17 to +22
# Create the sqlite directory with ownership to that user and group:
# NOTE: This is required for read/write by SQLite.
# - Path may be a default value somewhere, or only explicitly provided via DSN?
# - Owner/Group is only relevant to permissions allowing the hydra process to read/write to the location.
# - Bind mount volumes will replace the ownership with that of the host directory, requiring correction.
install --owner ory --group ory --directory /var/lib/sqlite
Author


I've added commentary here, but from what I understand this path is only relevant when it's part of the configured DSN, like in quickstart.yml? Ownership is only adjusted so that the non-root ory user has RW access.

Some platforms like OpenShift have a feature that uses the root group for access while randomizing the UID the container runs as. I'm not entirely sure how viable that approach is and raised some concerns at another repo (caddy-docker) where someone proposed supporting that.

However, group/other permissions need to be adjusted for the DSN path regardless so that the hydra process has write access to create the DB (the execute bit on the directory is also required for that operation), thus 770 (rwxrwx---) is necessary.

services:
  hydra:
    image: localhost/hydra:v2.2.0
    # No delay in shutdown (uses tini as PID 1), proper signal forwarding + reaping:
    init: true
    # Optional: Run as non-root user, but use the root group for /var/lib/sqlite to avoid needing a chown
    user: 500:0
    environment:
      # DSN to SQLite:
      # https://www.ory.sh/docs/self-hosted/deployment#sql-persistent
      DSN: sqlite:///var/lib/sqlite/db.sqlite?_fk=true
    # Unlike the current quickstart.yml example which relies upon the Docker Compose `depends_on`
    # feature to apply DB migrations through another container instance, just run both commands here:
    # NOTE: While the quickstart doesn't mention it, `migrate sql` should technically
    # have you manually create a backup copy prior to running it (take caution with restart policy if automating):
    # https://www.ory.sh/docs/hydra/self-hosted/dependencies-environment
    entrypoint: ["ash", "-c"]
    command:
      - |
        hydra --config /etc/hydra/config.yml migrate sql --read-from-env --yes
        hydra --config /etc/hydra/config.yml serve all --dev --sqa-opt-out
    # Build image locally (no need to try pull from a remote registry):
    pull_policy: build
    build:
      dockerfile_inline: |
        FROM alpine
        RUN install --mode 770 --directory /var/lib/sqlite
        COPY --from=oryd/hydra:v2.2.0 /usr/bin/hydra /usr/bin/hydra
        # `--chmod` only required when container user is not run as root,
        # Volume mounting the config instead is fine as `git clone` permissions use umask which should result in 644.
        ADD --chmod=640 https://raw.githubusercontent.com/ory/hydra/refs/heads/master/contrib/quickstart/5-min/hydra.yml /etc/hydra/config.yml

So that works well, and if the user bind mounts a host directory instead, they can just set the user field to match the UID of their host user. This avoids needing to set permissions of /var/lib/sqlite to 770, or needing to know the UID/GID of the container's ory user/group so that they can chown their host directory to match 👍 (both inconvenient)
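For that bind mount case, a minimal compose sketch (host UID/GID 1000:1000 is an assumption; check with `id`):

```yaml
services:
  hydra:
    image: oryd/hydra:v2.2.0
    # Run the container process as the host user owning ./sqlite,
    # so no chown/chmod is needed on either side:
    user: "1000:1000"
    volumes:
      - ./sqlite:/var/lib/sqlite
    environment:
      DSN: sqlite:///var/lib/sqlite/db.sqlite?_fk=true
```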

Member

Nice! The dockerfile_inline would probably need to build the binary inline as well, otherwise the binary type might mismatch (osx versus linux for example)

Author

otherwise the binary type might mismatch (osx versus linux for example)

I don't have a macOS environment to verify, but the compose example above is just a quick example. It copies the binary from the published image for the inferred platform, and AFAIK there is no macOS platform for container registries; only Windows and Linux images are published.

Unless you were referring to architecture, like AMD64 vs ARM64? That shouldn't be an issue: the directive uses the inferred platform to select the image to copy from, equivalent to FROM oryd/hydra or docker pull oryd/hydra without --platform= specified for either.
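If the platform inference were ever a concern, the source image's platform could also be pinned explicitly in a named stage. A hedged sketch (the stage name is hypothetical; by default `FROM` and `COPY --from` already resolve to the build's target platform):

```dockerfile
# Hypothetical: make the implicit platform selection explicit.
# $TARGETPLATFORM is an automatic BuildKit build argument.
FROM --platform=$TARGETPLATFORM oryd/hydra:v2.2.0 AS hydra-bin

FROM alpine
COPY --from=hydra-bin /usr/bin/hydra /usr/bin/hydra
```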


The example was focused on showing how much simpler the image is without catering to a non-root user, allowing the deployment to explicitly opt-in to non-root by choosing their own UID/GID, instead of needing to be mindful of what was chosen internally.

That's possible since the image doesn't have any real dependency lock-in on a non-root user being configured. They could just as easily take the current non-root images and run them as root in the same manner.


USER ory
COPY hydra /usr/bin/hydra
Author

Context is lacking a bit for these COPY instructions.

Since the image itself doesn't have a build stage, it's rather vague what the hydra binary was built with. I doubt it matters for most, but since this is used for the official image published to DockerHub, some context might be worthwhile.

I assume going forward both Dockerfile-alpine + Dockerfile-distroless will take the binary built via Dockerfile-build, but perhaps you typically build Hydra outside of a container? 🤷‍♂️

Member

Outside of docker compose and make docker, the binaries are built by goreleaser and then injected into the containers. So this is correct here.


ENTRYPOINT ["hydra"]
CMD ["serve", "all"]
USER ory
Author

As covered by earlier review comments, this could be dropped in a follow-up PR.

Technically a breaking change (but so is the 500:500 UID/GID proposed in this PR), but all a user should need to do is adjust the ownership of the /var/lib/sqlite or run with --user 101:101 (might differ by image depending on implicit UID/GID assigned to ory).

Member

I don't agree with this change, the hydra command should run as non-root in my view

Author

There is no suggestion snippet to remove the line. It was simply shifted to the end of the file. A change to drop it would be:

Suggested change
USER ory

My comment only proposed dropping non-root, and noted that this PR configures the non-root user with a stable UID/GID value rather than relying on it implicitly (which is unstable: any package installed earlier may add users/groups, causing the UID/GID assigned to ory to change unexpectedly).

I then noted that the stable UID/GID change itself, would be a breaking change to existing users of the image going forward, similar to if changing from non-root to root.


Since there's no real dependency on the non-root user in the image, it's fine and someone can choose to explicitly run as root at runtime (which should be secure, especially via rootless container).

Comment on lines +1 to +4
# TODO: Remove this file in favor of distroless-static variant:
# https://github.com/ory/hydra/blob/master/.docker/Dockerfile-distroless-static
# However if published to any registry, continue to publish the variant tag but as an alias to `-distroless` tags:
# https://github.com/ory/hydra/pull/3914#pullrequestreview-2527315326
Author

This file is redundant AFAIK, I don't know where it's being used/published, so proposed changes for consistency remain here and the file could be dropped in a follow-up PR, reverting the change if required.

Member

I checked and we no longer distribute this scratch image but instead the distroless variant. It should be OK to be removed.

Comment on lines +1 to +4
# TODO: Remove this file in favor of the main/default Alpine image. The sqlite package is no longer required:
# https://github.com/ory/hydra/blob/master/.docker/Dockerfile-alpine
# However if published to any registry, continue to publish the variant tag but as an alias to standard Alpine image tags:
# https://github.com/ory/hydra/pull/3914#pullrequestreview-2527315326
Author

This file is redundant AFAIK, I don't know where it's being used/published, so proposed changes for consistency remain here and the file could be dropped in a follow-up PR, reverting the change if required.

Member

Let's remove it, I also could not find any use for it.

@polarathene (Author)

Overview of everything above

This comment should summarize everything with links for more details if you need them.

Changes from this PR

This PR applies the following changes:

  • Refactor Alpine images to be consistent with directives and common shared logic.
    • RUN directives now use the heredoc syntax (requires Docker Engine v23+, or DOCKER_BUILDKIT=1 on older engines).
    • 😎 Install ca-certificates + create ory user+group (with UID/GID pinned for consistency)
    • 😎 /var/lib/sqlite directory created via install command (effectively mkdir + chown)
    • 🚀 Remove legacy nsswitch.conf workaround
    • 🚀 Remove legacy VOLUME directive
    • 👀 Realization that Dockerfile-scratch + Dockerfile-sqlite are redundant and could be deleted in future.
    • ⚠️ Realization that final stage of Dockerfile-hsm is broken, but stage isn't used (removal in future proposed, see TODOs below for more info).
  • USER directive shifted to end of each Dockerfile. Preferring non-root likely causes more issues than benefits (removal in future proposed, see TODOs below for more info).
  • Additional commentary. Some of it could be dropped, but adding it directly into the source makes it more likely to be addressed promptly, and gives future maintainers / readers better context to reference.

Overall if we strip the commentary and ignore minor differences, Dockerfile-alpine, Dockerfile-scratch, Dockerfile-sqlite and Dockerfile-hsm are all normalized to roughly this same common snippet:

RUN <<HEREDOC
    apk upgrade --no-cache
    apk add --no-cache --upgrade ca-certificates

    addgroup --system --gid 500 ory
    adduser --system --uid 500 \
      --gecos "Ory User" \
      --home /home/ory \
      --ingroup ory \
      --shell /sbin/nologin \
      ory

    install --owner ory --group ory --directory /var/lib/sqlite
HEREDOC

COPY hydra /usr/bin/hydra
ENTRYPOINT ["hydra"]
CMD ["serve", "all"]
USER ory

While addressing the two referenced issues at the start of the PR description (removal of legacy nsswitch.conf and VOLUME directive).

In future if dropping the non-root ory user, the Alpine and Distroless images effectively become this:

FROM alpine:3.20
RUN apk add --no-cache ca-certificates && mkdir -p /var/lib/sqlite

COPY hydra /usr/bin/hydra
ENTRYPOINT ["hydra"]
CMD ["serve", "all"]

# and the Distroless variant:
FROM gcr.io/distroless/static-debian12:nonroot
RUN --mount=from=busybox:latest,src=/bin/,dst=/bin/ mkdir -p /var/lib/sqlite

COPY hydra /usr/bin/hydra
ENTRYPOINT ["hydra"]
CMD ["serve", "all"]

All the verbosity I've provided is towards instilling confidence to pursue that change via a follow-up PR.

Additional context

  • The DockerHub registry only publishes two images, Alpine (default) + Debian (-distroless suffix). The HSM variant is only for testing. Each Dockerfile COPYs the hydra binary in from outside the build, except for Dockerfile-build (although this does not appear to be used for publishing the -distroless images?)
  • Dockerfile-alpine + Dockerfile-build are the only relevant images to retain / revise. The HSM support is minor changes, see TODOs below for future work.
  • I've not updated any other Dockerfile, like the conformance test, which may warrant adopting the same changes for easier maintenance (it's effectively the -distroless variant but with a custom cert/CA added; there are better ways to approach that). Side note: other Dockerfiles for testing have base images over 5 years old; those should probably be updated.

Future PRs / tasks

Related TODOs from observations while tackling this PR, not to be resolved by this PR:

  • E2E test in CI is failing for HSM. This PR doesn't influence that, it's a difference with the CI running Ubuntu while the Docker images use Debian which has the softhsm package (Issue).
  • DockerHub (and any other registry that might be published to?) has misconfigured publishing config (Issue, Ref).
  • Images presently published (v2.2.0) have a dynamically linked hydra binary for Alpine with sqlite support, while the Distroless variant lacks sqlite support (unless switching to Dockerfile-build like make docker uses to produce a -sqlite image tag). Both images should be published with the same static binary (Ref, see minor inconsistency section).
  • DX concerns with docs related to Docker and Development guidance for contributors (Issue, Issue 2).
  • Follow-up PR should consider dropping non-root, it doesn't seem necessary to maintain, nor properly documented to the image consumer (Ref 1, Ref 2, Ref 3, Ref 4).
  • Follow-up PR could delete a few redundant Dockerfile variants (Dockerfile-sqlite, Dockerfile-scratch, Dockerfile-distroless-static), which along with defaulting to root as the container user could potentially cover all images in a small single Dockerfile (or two if keeping HSM separate) (Ref 1. Ref 2, Ref 3, Ref 4, Ref 5, Ref 6).
  • The HSM image final stage is broken. Apparently not used, so could be dropped (Ref 1, Ref 2).
  • GO111MODULE=on is no longer relevant since Go 1.16 (Feb 2021), ignored since Go 1.17. You should probably remove GO111MODULE from the various files in this repo.
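For the GO111MODULE cleanup, a simple recursive grep would locate the stale references. A hypothetical invocation (the fabricated tree below only exists to make the snippet runnable as-is; in the real repo you would run the grep from the repository root):

```shell
# Fabricate a tiny repo tree so the command is self-contained.
repo=$(mktemp -d)
printf 'export GO111MODULE=on\n' > "$repo/Makefile"

# List every file:line still setting the obsolete variable.
grep -rn 'GO111MODULE' "$repo"
```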

@aeneasr (Member)

aeneasr commented Jan 8, 2025

The HSM image final stage is broken. Apparently not used, so could be dropped (#3914 (comment), #3914 (comment)).

Unfortunately, that stage is used for CI tests and is causing CI failures

@aeneasr mentioned this pull request Jan 10, 2025
@polarathene (Author) left a comment

While this PR is now considered redundant as you've raised a replacement PR, I'm wrapping it up with feedback to your last review comments.

No need to respond to the feedback unless you want to engage in that discussion further. I respect your decision to keep non-root if you choose to (users tend to request it anyway).

Apologies for the late response, busy week.

# NOTE: This is required for read/write by SQLite.
install --owner ory --group ory --directory /var/lib/sqlite

# NOTE: Presumably this was already created by the prior RUN directive
Author

Uhh, I generally don't see people understanding the non-root practice in containers; they just push for it as a security practice without knowing why.

As I've said, the container root user is granted a default set of capabilities, but these aren't the same as host root (full caps). Switching to a non-root user just drops those additional caps, which you can also do yourself explicitly as a security practice (drop all caps, or, when compatible, switch the user to non-root if you really insist).

The other reason to use non-root is container escape: escaping as the root user and becoming root on the host would allow more damage. But non-root in a container isn't exempt from this either; depending on the vulnerability exploited, you can still escape as non-root and become root on the host. The bulk of vulnerabilities require intentionally granting capabilities or relaxing security beyond what would otherwise be permitted by default, hence why I'm not really sold on the non-root container user "best practice".
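The capability difference is easy to observe on any Linux box. A minimal sketch (values vary per environment, so none are asserted here):

```shell
# The kernel exposes a process's capability sets in /proc/self/status.
# Inside a container, root typically holds the runtime's default bounding
# set (non-zero CapBnd/CapEff); `--user` or `--cap-drop=ALL` shrinks these,
# which is essentially what the non-root switch buys in capability terms.
grep '^Cap' /proc/self/status
```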

I've seen a variety of containers that adopted non-root as a security practice, but then did workarounds that made security worse. Some granted the non-root user access to the docker socket; others granted capabilities that wouldn't be allowed by default, or mandated capabilities to support non-root in such a way that even someone who doesn't use the feature requiring that capability, and who knows how to properly lock down a container, is forced to grant it just for the binary to run...

To assume that a container built to run as non-root is always more secure would be a mistake. Inexperienced users can make a container insecure regardless, and trying to address incompetence with a blanket "fix" often introduces more caveats, especially when the image fails to document this practice; that is ill-advised IMO. If you're going to defend the practice, have a good reason beyond parroting / hand-waving the justification.

More security conscious (even those inexperienced) should be using rootless Docker/Podman, where it's perfectly ok to have the container user as root, since escaping the container does not result in retaining the same UID/GID of the container user thanks to user namespace mapping.
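The user-namespace remapping under rootless engines can be inspected directly. A minimal sketch (output depends on the host, so nothing specific is asserted):

```shell
# /proc/self/uid_map shows "inside-UID  outside-UID  range" for the current
# user namespace. Outside any user namespace it reads "0 0 4294967295";
# under rootless Docker/Podman, container UID 0 maps to an unprivileged
# host UID instead, which is why container root there is not host root.
cat /proc/self/uid_map
```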

Root in a container is not equivalent to root on the host. I've explained and demonstrated this so many times in the past to those that don't know better but parrot paranoia insisting that they know what they're talking about.


Running as root will fail several CI checks, please revert the user changes made

I'm not sure what CI checks you were referring to, but your response is to a comment that proposed dropping the non-root approach. The PR respected the non-root decision; all it did here was ensure consistency with the other images' non-root user setup, so there is nothing to revert.



Comment on lines +17 to +22
# Create the sqlite directory with ownership to that user and group:
# NOTE: This is required for read/write by SQLite.
# - Path may be a default value somewhere, or only explicitly provided via DSN?
# - Owner/Group is only relevant to permissions allowing the hydra process to read/write to the location.
# - Bind mount volumes will replace the ownership with that of the host directory, requiring correction.
install --owner ory --group ory --directory /var/lib/sqlite

@polarathene (Author)

Unfortunately, that stage is used for CI tests and is causing CI failures

If that were the case, could you point out where? Pretty sure I identified the CI failure as unrelated to the Docker image or stage.
